SH3D1A



1 CAAAAGAATT CCGOOT^ C^CICGC^ GGAAGAATOC CGAGCGG3CT 
51 CCGOGACGSA CAG^AGOC^ GKSOKSflG GIUIGCGGGG ClCCGGCICC 
101 igCGTCOCIC CCftGQ3GCGC GPGftQCQGCA CTGATTTOTC CCT33QG0GG 
151 CAQCGCSGAC CCQOCCO^ A^GOIIC GATTAGCAAG OTAAAAGIAA 
201 O^ACCA^ GCICAGTTIC O^CCHT T^IOGCA^ C^^rCT 
251 GGGCCATAAC 1CTAGAGGAA AGAGCGAAGC AT3ATCAGCA CJITCCATAOT 
301 tt^^GOIAA TASC^ 

351 TITICAA.TCr aXTTTACClC A^CO^nTI AGC^CAGATA T03QCACIAG 
401 CT3ACATGSAA TAATGA3K3GA AGAAT3GATC AAOTGGAGIT TTCCAlPiGCr 
451 ATGAAACITA TCAAACT3AA GCTACAAGGA TATC^GCTAC CCTCTGCACT 
501 toqqCCTGTC ATGAAACAQC AACCftGITGC TATTTCTAQC QCAOCAQCAT 
551 TIGGIIAIOSG AQC7IATGGCC AGCASGCXZAC GQCTTACAGC TOTTGCIOCA
601 GIQQCAAIGG GATCCATTCC AGTTSrlQGA AT3TCIDCAA CCCI&3EATC

651 nciannx ao^gcig oaxocox^ gqci^ ccnx^

701 ^^^TCK^ 
751 A^nCX^TIS^^ 

801 aCAAAAQGCA. CAGICftTnG ATGIQGCCfiG TCJlCCCflCCA GflGGCAGflST 
851 GMHOlTOaai^^ 

901 CATGftCAAAA CTftiTOflGFIGG ACACTIAAC& GGlOCGCAftG CAAGAACnVI 
951 TCTIATQCAG TCAAGITEAC CACAGGCICA GCT3GCTICA AIAIQGAATC 

1001 TIICiaiDffWn^ 
1051 GCAATCCACC TCATIGATCT AGCTATGTCT QGCCAACCAC T3CCACCIOT
1101 CCIGCCICCA GAATACATIC CACCTICITr TAGAAGAGTT CGATCIt33CA 
1151 OroOIMaiC TGICAXAAGC TCAACATCIG TAGATCAGAG GCIACCAGAG 
1201 GAACCAGTIT TAGAAGATGA ACAACAACAA TTAGAAAAGA AATTACCiar 
1251 AACGTTIGAA GAIAAGAAQC QGGAGAACTT TGAACGTOGC AACCT3GAAC 
1301 1QGAGAAAOG AAQGCAAGCT CTCCIQGAAC AGCAGOQCAA GGAQCAGGAG 
1351 CGCCIGGOCC AGCIGGAGCG GGCGGAQCAG GAGAGGAAGG 
1401 CCAGGAGCAA GAGCGCAAAA GACAACTSGA ACT3GAGAAG CAACTOGAAA 
1451 *3CAG2GOGA GCTAGAACGG CAGAGAGAGG AGGAGAQSAG GAAAGAAATT 
1501 GAGAGGCGAG AGGCTGCAAA ACGGGAACTT GAAAGGCAAC C&CAACTIGA 
1551 GT33GAACGG AATCGAAQGC AAGAACTACT AAAICAAAGA AACAAAGAAC 
1601 AAGAGGACAT AGTTG^CIG AAAGCAAAGA AAAAGACXTr GSAATTCGAA 
1651 TTAGAAGCIC TAAA1GATAA AAAGCATCAA CTAGAAGGGA AACTICAAGA 
1701 TATCAGA1GT CXATIGACCA OCCAAAGGCA AGAAATIGAG AGCACAAACA 
1751 AATCTAGAGA GTIGAGAATT GCCGAAATCA CCCATCTACA GCAftCAftTIA 
1801 CAGGAATCIC AGCAAATQCT TQGAAGACTT ATICCAGAAA AACAGAHACT 
1851 CAATCACCAA TTAAAACAAG TICAGCAGAA CAGTITOCAC AGAGATICAC 
1901 TIOTTACACT TAAAAGAGCC TTAGAAGCAA AAGAACTAGC TCGGCAQCAC 
1951 cmCGAGACC AACTGGAT3A AGTGGAGAAA GAAACTAGAT CAAAACIACA 
2001 G3AGATIGAT AXTITCAATA ATCAGCIGAA GGAACTAAGA GAAATACACA 
2051 ATAAGCAACA ACTOCAGAAG CAAAAGTCCA TGGAGGCIGA AGGACTGAAA 
2101 CAGAAAGAAC AAGAACGAAA GATCATAGAA TTAGAAAAAC AAAAAGAAGA 
2151 AGCCCAAAGA CGAGCTCAGG AAAGGGACAA GCAGTIGGCIG GAGCAT3TX3C 
2201 AGCAGGAGGA CGAGCATCAG AGAOCAAGAA AACTCCACGA AGAGGAAAAA 
2251 CIGAAAAQGG AOGftGAGTOT CAAAAAGAAG GAilGQCGftGG AAAAflQGCAA 
2301 ACAGGAAG2A CAAGACAAGC TGGGT03XT TTICCAICAA CACCAAGAAC
2351 CAGCIT^AGCC AGCTOTCCAG GCACCCIGGT OCACTQCAGA AAAAGGTCCA 
2401 CTTAOCATTT CTOCACAGSA AAATGTAAAA GiaSraiATT AOD3GQCACT 

2451 gtacoccttt gaatccagaa gccatckiga aatcactatc cagccaggag 

2501 ACAT^GTCAT GGT3GATGAA AGCCAAACTG GAGAACOCGG CIGGCTTGGA 
2551 G3AGAATTAA AAGGAAAGAC CCIQCAAACT AT3CAGAGAA 

2601 AATCCCAGAA AASGAGOTIC COGCKTCAGT GAAACCAGIQ ACIGATICAA 
2651 CATCIGCCCC TCCCCCCAAA CTOCXXTIGC GTGAGACOX CS3CCCCTTTG 
2701 GCAGTAACCT CTICAGAGCC citx^GACc CCTAATAACT GGGCOGACTT 
2751 CAOCJICCACG TG3CCCACCA QCACGAAIGA GAAACCAGAA ACGGATAACI 
2801 ^T^GGO*^

2851 TTAAGGCAGA GGICmZCTT TACTCCAGCC ACOGCCACIG GCTCCICCCC 
2901 GTCTCCTOTG CZftOOQCMG GIGftAAftGGT QGfiGGQGCTA CAAGCICAAG 
2951 OCXTTATKIOC TIQGAGAGCC AAAAAAGACA ACCACTEAAA TITTAACaAA. 
3001 AATGATGTCA TCACCGTCCT OGAACAGCAA GAC^TSGT GOTTIQGAGA 
3051 AGTTCAAG3T CAGAAGOGTT GGTTCOCCAA GTCTEACOIG AMCTCAITT 
3101 CAGGGCOCAT AAGGAAGICT ACAAGCATOG ATK7IGGTIC TTCAGAGAGT 
3151 CCTCCIACIC TAAAQCGAGT AGCCICTZCA GCAGCCAAGC CQGICGITIC 
3201 G3GAGAAGAA ATIGCCCAGG TIATIGOCTC AIACACCQCC ACCGOCCC03 
3251 AGCAGCTCAC TCIOGCCCCT GGTCAGCIGA TTTIGATCCG AAAAAAGAAC 
3301 CCAGGIQGAT GOTO33AAGG AGAGCIGCAA GCACGTGGGA AAAAGOXCA 
3351 GATAGGCIGG TTOOCAGCTA ATTATOIAAA GCTTCTAAGC CCTGGGACGA 
3401 OCAAAATCAC TCCAACAGAG CCAOCTAAGT CAACAGCATT AGOGGCAGT3 
3451 TGCCAGQTCA TTGGGAT3TA OGACTACACC GGGCAGAATG ACGATGAGCT 
3501 GGCCTICAAC AAGGGCCAGA 1CATX2AAGG7T CCICAACAAG GAGGAOOCTG

3551 ACTOSIGGAA AGGAGAAOIC AATOGACA2G 1GQGQCTCTT CCCATCCAAT 

3601 TATOIGAAGC TCACCACAGA CA3GGAOCCA AQOCAGCAAT G^'ICAMG 

3651 TIGICCATOC CmriCAGG CTIGAAAGTC CTCAAAGAGA CCCACTATCC 

3701 CATATCACIG OXM&GQGA TCATGGGAGA TOCAGCCTIG ATCATCTOAC 

3751 TTCCAGCATG A7CA0CT&CT GCCTICIGAG TAGAAGAACT CACIGCAGAG 

3801 CAGTTFAGCT CATlTiiAOCT TAGTTCCATG TGATOGCAAT CTITGAOTEA 

3851 TTACTIOCAG AGAIAGGAQC AAAAATTACA AAAACACACA GGGTCAGTCOG 

3901 ' lULTl T iU l G GCTITCCTAG TTACICAAAT TGACTTTOQC OCSyCXTTITCC 

3951 ACAGUIUL'IT TaXAraGITr TAAAATIMT TTEAAATASA TATTTEAGCT 

4001 titiaataaa caaaataaat aaatcactic Triucr/flTT aawrmiscA. 

4051 aaaagaccca ctatcaagga atqctocatg toctaitaaa aatojetcca 

4101 aaiotccata aa1cigagac tigatotatt titicatnt gtq3v3igtt 

4151 aocaactaaa ttoctocagt ttoqqgctct tccxxx7itac catagaagig 

4201 cagaggagit cagiatc1ct gtitiaaaga cgiatagaat gagcocaatt 

4251 aaagcgaagg tgattcjigct to1t1u1u1u tatcagciut accttctiga 

4301 gcatqtaata catccictac ataagaaatt agtictticc aiggcaaagc 

4351 tattacctig tadgatccic taatcaiatt gcaiteaatt tiai'itiuca 

4401 agagigacct toimccaca 1gagaaagca ciuiuiui'it ttottoagtc 

4451 tcagaxitat ciggfttoact 1qototttig tttogogrltt tiaatit1uc 

4501 gicttigcat agcataaaat caot&gacaa cao^cigag gtogteagga 

4551 1CAAGGA3AT CX^CAGTCIC TTITTAGTCr CIGTIACATG AAGTITIATT 

4601 CCAGTEACTT TTCATOGAAT GACL'iAl'ITF GAACAAGTAA TTTICTIGAC 

4651 AAGAAAGAAT GTATAGAAGT CTCOCIQCAA TTAATITCCA AJIGITTACAT 

4701 Tl ' ITlA ACEA GGACTCIQGA A'lTlCTACAG A3TAATATGA AAIQSAGCIC 
4751 ATGGTCCGTT TGICfTCflTfiG ATAIGCT3IA GCIGAAGCCC TSmOICIT

4801 TEAABCftCIA GITQGAAGCT CICftATftAAA AT30CT3CIG CICftCAQCRC 

4851 AGAAAftOTGG GCAQQC3QGAG CCICAAQCftC AATCISGCIG TCCTOCTAAA 

4901 GACICICTAA TCCICAATCC CCITCCGTTIC TCCEOGOGCT CTOGGGftGGC 

4951 TOrocroGiG aiojiOiiftsv QGficcrrnc ctitcaaatc oracacspcws

5001 A3ftGG£CCTT TCCICCTIGT TCAOITOCAA TICAGTATrr TCSCGGMKr 

5051 GAATCTAAAA TATTCAAAIA IfcOfcAACCIG MGATTEftAC AAATOTRAAA 

5101 CAACCTITIG AMTSOTICC GftOTATRGAT AAXISAATTT TE&AAACAAA 

5151 AOTAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAACTOGftC G0Q0CO30S 
SH3D1A Translated Protein Sequence:

1 MftQFETSTOG SLDIWAIIVE ERAKHEQQFH SLKPISGFIT GDQAKNFFFQ 

51 SGLPQFv'LAQ IWALAEMNND GRMDQVEFSI AMKLIKLKLQ GV^SALPP 

101 VMKQQPVAIS SAPAFGM3GI ASMPPLTAVA PVPM3SIPW GMSPTLVSSV 

151 PTAAVPPLftM GAPPV1QPLP AFAHPAATLP KSSSFSRSGP GSQUsTCKLQK 

201 AQSFDVfcSVP PVAEWHVPQS SRLKYHQLFN SHDKIMSGHL IGPQARTHM 

251 QSSLFQAQLA SIVWLSDIDQ DGKLTAEEFT LftMHLIDVaM SQQPLPPVLP 

301 PEXIPPSFRR VRSGSGISVI SSTSVDQRLP EEPVLEDBQQ QLEKKLPVTF 

351 EDKKRENFER GNE£X£KRBQ ALLEQ3RKEQ ERLAQLERAE QERKEMPQE 

401 QEBKRQLELE KQLEKQRELE RQREEERRKE IERREAAKRE LEPQRQLEWE 

451 SNREQELLNQ KNKEQEDIVV LKAKKKTLEF ELEALNDKKH QLEGKLQDXR 

501 CRLTT2PQEI ESONKSRELR IAEE1HLQQQ LGESQQMLGR LIPEKQItUD 

551 QLKQVQ9BL HRDSLVTLKR ALEAKELARQ HLRD3LDEVE K27TOSKLQEI 

601 DIFKN3LKEL RELHNKQQLQ KQKSMEAERL KQKBQERKII ELEHQKEERQ 
651 KRAQEREKQW LEHVSQEEEH QRPKKLHEEE KLKREESVKK KDGEEKGKQE

701 AQCKLGKLFH QHQEPAKPAV QAEWSEAEXG PLT1S»3ENV KWYVPALYP 

751 FESRSHDKET IQFGD2VMVD ESQfTOEPGWL GGELKGK3GW FPANYAEKXP 

801 ajEVPAPVK? VTDSTSAPAP KLALRETPAP LAVTSSEPST TPNNWM2FSS 

851 1WPTSTOEKP ETENWDAWAA QPSLTVPSAG QLPQRSAPTP ATATGSSPSP 

901 VIX3QGEKVB3 LQ^CALYFWR AKKENHIHEN KNDVITVLEQ QCMtfiFGEVQ 

951 (33KGWFPKSY VKLISGPIKK OTSMDSGSSE SPASLtfRVAS PAAKPWSGE 

1001 ETAQVIASYT ATCPBQWIA PGQLILIRKK NFGGWW03EL QAR3KKRQIG 

1051 WFPANYVKLL SPOISKXTPT EPEKSTMAA VCQVIQflD? TAQNCCEXAF

1101 NW3QHNVLN KEDEDWWKGE VNQQVGLFPS NYVKLTHMD PSQ 
1 GCACGAGAGG GAGCGAAGGA GGTAGAGAAG AGTGGAGGCG CCAGGGGAGG
51 GAGCGTAGCT TGGTTGCTCC GTAGTACGGC GGCTCGCGAG GAAGAATCCC 
101 GAGCGGGCTC CGGGACGGAC AGAGAGGCGG GCGGGGATGG TGTGCGGGGC 
151 TGCGGCTCCT GCGTCCCTCC CAGCGGCGCG TGAGCGGCAC TGATTTGTCC 
201 CTGGGGCGGC AGCGCGGACC CGCCCGGAGA TGAGGCGTCG ATTAGCAAGG
251 TAAAAGTAAC AGAACCATGG CTCAGTTTCC AACACCTTTT GGTGGCAGCC
301 TGGATATCTG GGCCATAACT GTAGAGGAAA GAGCGAAGCA TGATCAGCAG
351 TTCCATAGTT TAAAGCCAAT ATCTGGATTC ATTACTGGTG ATCAAGCTAG
401 AAACTTTTTT TTTCAATCTG GGTTACCTCA ACCTGTTTTA GCACAGATAT
451 GGGCACTAGC TGACATGAAT AATGATGGAA GAATGGATCA AGTGGAGTTT
501 TCCATAGCTA TGAAACTTAT CAAACTGAAG CTACAAGGAT ATCAGCTACC
551 CTCTGCACTT CCCCCTGTCA TGAAACAGCA ACCAGTTGCT ATTTCTAGCG
601 CACCAGCATT TGGTATGGGA GGTATCGCCA GCATGCCACC GCTTACAGCT
651 GTTGCTCCAG TGCCAATGGG ATCCATTCCA GTTGTTGGAA TGTCTCCAAC
701 CCTAGTATCT TCTGTTCCCA CAGCAGCTGT GCCCCCCCTG GCTAACGGGG
751 CTCCCCCTGT TATACAACCT CTGCCTGCAT TTGCTCATCC TGCAGCCACA
801 TTGCCAAAGA GTTCTTCCTT TAGTAGATCT GGTCCAGGGT CACAACTAAA
851 CACTAAATTA CAAAAGGCAC AGTCATTTGA TGTGGCCAGT GTCCCACCAG
901 TGGCAGAGTG GGCTGTTCCT CAGTCATCAA GACTGAAATA CAGGCAATTA
951 TTCAATAGTC ATGACAAAAC TATGAGTGGA CACTTAACAG GTCCCCAAGC
1001 AAGAACTATT CTTATGCAGT CAAGTTTACC ACAGGCTCAG CTGGCTTCAA
1051 TATGGAATCT TTCTGACATT GATCAAGATG GAAAACTTAC AGCAGAGGAA
1101 TTTATCCTGG CAATGCACCT CATTGATGTA GCTATGTCTG GCCAACCACT
1151 GCCACCTGTC CTGCCTCCAG AATACATTCC ACCTTCTTTT AGAAGAGTTC
1201 GATCTGGCAG TGGTATATCT GTCATAAGCT CAACATCTGT AGATCAGAGG
1251 CTACCAGAGG AACCAGTTTT AGAAGATGAA CAACAACAAT TAGAAAAGAA
1301 ATTACCTGTA ACGTTTGAAG ATAAGAAGCG GGAGAACTTT GAACGTGGCA
1351 ACCTGGAACT GGAGAAACGA AGGCAAGCTC TCCTGGAACA GCAGCGCAAG
1401 GAGCAGGAGC GCCTGGCCCA GCTGGAGCGG GCGGAGCAGG AGAGGAAGGA 
1451 GCGTGAGCGC CAGGAGCAAG AGCGCAAAAG ACAACTGGAA CTGGAGAAGC 
1501 AACTGGAAAA GCAGCGGGAG CTAGAACGGC AGAGAGAGGA GGAGAGGAGG 
1551 AAAGAAATTG AGAGGCGAGA GGCTGCAAAA CGGGAACTTG AAAGGCAACG
1601 ACAACTTGAG TGGGAACGGA ATCGAAGGCA AGAACTACTA AATCAAAGAA
1651 ACAAAGAACA AGAGGACATA GTTGTACTGA AAGCAAAGAA AAAGACTTTG
1701 GAATTTGAAT TAGAAGCTCT AAATGATAAA AAGCATCAAC TAGAAGGGAA
1751 ACTTCAAGAT ATCAGATGTC GATTGACCAC CCAAAGGCAA GAAATTGAGA
1801 GCACAAACAA ATCTAGAGAG TTGAGAATTG CCGAAATCAC CCATCTACAG
1851 CAACAATTAC AGGAATCTCA GCAAATGCTT GGAAGACTTA TTCCAGAAAA
1901 ACAGATACTC AATGACCAAT TAAAACAAGT TCAGCAGAAC AGTTTGCACA
1951 GAGATTCACT TGTTACACTT AAAAGAGCCT TAGAAGCAAA AGAACTAGCT
2001 CGGCAGCACC TACGAGACCA ACTGGATGAA GTGGAGAAAG AAACTAGATC
2051 AAAACTACAG GAGATTGATA TTTTCAATAA TCAGCTGAAG GAACTAAGAG
2101 AAATACACAA TAAGCAACAA CTCCAGAAGC AAAAGTCCAT GGAGGCTGAA
2151 CGACTGAAAC AGAAAGAACA AGAACGAAAG ATCATAGAAT TAGAAAAACA
2201 AAAAGAAGAA GCCCAAAGAC GAGCTCAGGA AAGGGACAAG CAGTGGCTGG 
2251 AGCATGTGCA GCAGGAGGAC GAGCATCAGA GACCAAGAAA ACTCCACGAA 
2301 GAGGAAAAAC TGAAAAGGGA GGAGAGTGTC AAAAAGAAGG ATGGCGAGGA 
2351 AAAAGGCAAA CAGGAAGCAC AAGACAAGCT GGGTCGGCTT TTCCATCAAC
2401 ACCAAGAACC AGCTAAGCCA GCTGTCCAGG CACCCTGGTC CACTGCAGAA
2451 AAAGGTCCAC TTACCATTTC TGCACAGGAA AATGTAAAAG TGGTGTATTA
2501 CCGGGCACTG TACCCCTTTG AATCCAGAAG CCATGATGAA ATCACTATCC
2551 AGCCAGGAGA CATAGTCATG GTTAAAGGGG AATGGGTGGA TGAAAGCCAA
2601 ACTGGAGAAC CCGGCTGGCT TGGAGGAGAA TTAAAAGGAA AGACAGGGTG
2651 GTTCCCTGCA AACTATGCAG AGAAAATCCC AGAAAATGAG GTTCCCGCTC
2701 CAGTGAAACC AGTGACTGAT TCAACATCTG CCCCTGCCCC CAAACTGGCC
2751 TTGCGTGAGA CCCCCGCCCC TTTGGCAGTA ACCTCTTCAG AGCCCTCCAC
2801 GACCCCTAAT AACTGGGCCG ACTTCAGCTC CACGTGGCCC ACCAGCACGA
2851 ATGAGAAACC AGAAACGGAT AACTGGGATG CATGGGCAGC CCAGCCCTCT
2901 CTCACCGTTC CAAGTGCCGG CCAGTTAAGG CAGAGGTCCG CCTTTACTCC
2951 AGCCACGGCC ACTGGCTCCT CCCCGTCTCC TGTGCTAGGC CAGGGTGAAA
3001 AGGTGGAGGG GCTACAAGCT CAAGCCCTAT ATCCTTGGAG AGCCAAAAAA
3051 GACAACCACT TAAATTTTAA CAAAAATGAT GTCATCACCG TCCTGGAACA
3101 GCAAGACATG TGGTGGTTTG GAGAAGTTCA AGGTCAGAAG GGTTGGTTCC
3151 CCAAGTCTTA CGTGAAACTC ATTTCAGGGC CCATAAGGAA GTCTACAAGC
3201 ATGGATTCTG GTTCTTCAGA GAGTCCTGCT AGTCTAAAGC GAGTAGCCTC
3251 TCCAGCAGCC AAGCCGGTCG TTTCGGGAGA AGAATTTATT GCCATGTACA
3301 CTTACGAGAG TTCTGAGCAA GGAGATTTAA CCTTTCAGCA AGGGGATGTG
3351 ATTTTGGTTA CCAAGAAAGA TGGTGACTGG TGGACAGGAA CAGTGGGCGA
3401 CAAGGCCGGA GTCTTCCCTT CTAACTATGT GAGGCTTAAA GATTCAGAGG
3451 GCTCTGGAAC TGCTGGGAAA ACAGGGAGTT TAGGAAAAAA ACCTGAAATT
3501 GCCCAGGTTA TTGCCTCATA CACCGCCACC GGCCCCGAGC AGCTCACTCT
3551 CGCCCCTGGT CAGCTGATTT TGATCCGAAA AAAGAACCCA GGTGGATGGT
3601 GGGAAGGAGA GCTGCAAGCA CGTGGGAAAA AGCGCCAGAT AGGCTGGTTC 
3651 CCAGCTAATT ATGTAAAGCT TCTAAGCCCT GGGACGAGCA AAATCACTCC
3701 AACAGAGCCA CCTAAGTCAA CAGCATTAGC GGCAGTGTGC CAGGTGATTG
3751 GGATGTACGA CTACACCGCG CAGAATGACG ATGAGCTGGC CTTCAACAAG
3801 GGCCAGATCA TCAACGTCCT CAACAAGGAG GACCCTGACT GGTGGAAAGG 
3851 AGAAGTCAAT GGACAAGTGG GGCTCTTCCC ATCCAATTAT GTGAAGCTGA 
3901 CCACAGACAT GGACCCAAGC CAGCAATGAA TCATATGTTG TCCATCCCCC 
3951 CCTCAGGCTT GAAAGTCCTC AAAGAGACCC ACTATCCCAT ATCACTGCCC
4001 AGAGGGATGA TGGGAGATGC AGCCTTGATC ATGTGACTTC CAGCATGATC 
4051 ACCTACTGCC TTCTGAGTAG AAGAACTCAC TGCAGAGCAG TTTACCTCAT 
4101 TTTACCTTAG TTGCATGTGA TCGCAATGTT TGAGTTATTA CTTGCAGAGA 
4151 TAGGAGCAAA AATTACAAAA ACACACAGGG TAGTGGGTCC TTTTGTGGCT 
4201 TTCCTAGTTA CTCAAATTGA CTTTCCCCCA CCTTTGCACA GGTGCTTTCA
4251 ATAGTTTTAA AATTATTTTT AAATATATAT TTTAGCTTTT TAATAAACAA
4301 AATAAATAAA TGACTTCTTT GCTATTTTGG TTTTGCAAAA AGACCCACTA
4351 TCAAGGAATG CTGCATGTGC TATTAAAAAT TGTTCCAAAT GTCCATAAAT
4401 CTGAGACTTG ATGTATTTTT TCATTTTGTC CAGTGTTACC AACTAAATTG
4451 TGCAGTTTGG GGCTTTTCCC CCTTACCATA GAAGTGCAGA GGAGTTCAGT 
4501 ATCTCTGTTT TAAAGACGTA TAGAATGAGC CCAATTAAAG CGAAGGTGTT 
4551 TGTGCTTGTT TGTGTGTATC AGCTGTACCT TGTTGAGCAT GTAATACATC
4601 CTGTACATAA GAAATTAGTT CTTTCCATGG CAAAGCTATT ACCTTGTACG
4651 ATGCTCTAAT CATATTGCAT TTAATTTTAT TTTGCACAGT GACCTTGTAG
4701 CCACATGAGA AAGCACTCTG TGTTTTTGTT CGGTCTCAGA TTTATCTGGT
4751 TGAGTTGGTG TTTTGTTTGG GGTTTTTAAT TTTGCGTGTT TGCATAGCAT
4801 AAAATCAGTA GACAACACCA CTGAGGTCGT TACGATCAAC GATATCCACA
4851 GTCTCTTTTT AGTCTCTGTT ACATGAAGTT TTATTCCAGT TACTTTTCAT
4901 GGAATGACCT ATTTTGAACA AGTAATTTTC TTGACAAGAA AGAATGTATA
4951 GAAGTCTCCC TGCAATTAAT TTCCAATGTT TACATTTTTT AACTAGACTG 
5001 TGGAATTTCT ACAGATTAAT ATGAAATGGA GCTCATGGTC CGTTTGTGTG
5051 TTAGATATGC TGTAGCTGAA GCCCTGTTTG TCTTTTAAAC ACTAGTTGGA
5101 AGCTCTCAAT AAAAATGCCT GCTGCTCACA GCACAGAAAA TGGGGCAGGG
5151 GGAGCCTCAA GCACAATCTA GCTGTCCTCC TAAAGACTCT GTAATGCTCA
5201 CTCCCCTCGC GTTCTCCCGG CGCTGTCGGG AGGCTGTGCT GGTGGTCGTG
5251 TAGAGGTCCT TCTCCTTTCA CATGGTGCAG AGAGCGAGGA CCTCTCCTCC
5301 TCGTTCAGTT GCACTTCAGT ATTTTCACGG ATATGAATGT AAAATATATA
5351 AATATATAAA CCTGCGGCTT TAACAACTGT AATACAACCT TTTGAATTAG
5401 TTCCGTGTAT AGATAATTAA ATTCTTCATA CAAAAGTTAA AAAAAAAAAA
5451 AAAAAAAA
#21 translated protein sequence:

1 MAQFPTPFGG SLDIWAITVE ERAKHDQQFH SLKPISGFIT GDQARNFFFQ 
51 SGLPQPVLAQ IWALADMNND GRMDQVEFSI AMKLIKLKLQ GYQLPSALPP 
101 VMKQQPVAIS SAPAFGMGGI ASMPPLTAVA PVPMGSIPVV GMSPTLVSSV
151 PTAAVPPLAN GAPPVIQPLP AFAHPAATLP KSSSFSRSGP GSQLNTKLQK
201 AQSFDVASVP PVAEWAVPQS SRLKYRQLFN SHDKTMSGHL TGPQARTILM
251 QSSLPQAQLA SIWNLSDIDQ DGKLTAEEFI LAMHLIDVAM SGQPLPPVLP
301 PEYIPPSFRR VRSGSGISVI SSTSVDQRLP EEPVLEDEQQ QLEKKLPVTF
351 EDKKRENFER GNLELEKRRQ ALLEQQRKEQ ERLAQLERAE QERKERERQE
401 QERKRQLELE KQLEKQRELE RQREEERRKE IERREAAKRE LERQRQLEWE
451 RNRRQELLNQ RNKEQEDIVV LKAKKKTLEF ELEALNDKKH QLEGKLQDIR
501 CRLTTQRQEI ESTNKSRELR IAEITHLQQQ LQESQQMLGR LIPEKQILND
551 QLKQVQQNSL HRDSLVTLKR ALEAKELARQ HLRDQLDEVE KETRSKLQEI
601 DIFNNQLKEL REIHNKQQLQ KQKSMEAERL KQKEQERKII ELEKQKEEAQ
651 RRAQERDKQW LEHVQQEDEH QRPRKLHEEE KLKREESVKK KDGEEKGKQE
701 AQDKLGRLFH QHQEPAKPAV QAPWSTAEKG PLTISAQENV KVVYYRALYP
751 FESRSHDEIT IQPGDIVMVK GEWVDESQTG EPGWLGGELK GKTGWFPANY
801 AEKIPENEVP APVKPVTDST SAPAPKLALR ETPAPLAVTS SEPSTTPNNW
851 ADFSSTWPTS TNEKPETDNW DAWAAQPSLT VPSAGQLRQR SAFTPATATG
901 SSPSPVLGQG EKVEGLQAQA LYPWRAKKDN HLNFNKNDVI TVLEQQDMWW
951 FGEVQGQKGW FPKSYVKLIS GPIRKSTSMD SGSSESPASL KRVASPAAKP
1001 VVSGEEFIAM YTYESSEQGD LTFQQGDVIL VTKKDGDWWT GTVGDKAGVF
1051 PSNYVRLKDS EGSGTAGKTG SLGKKPEIAQ VIASYTATGP EQLTLAPGQL
1101 ILIRKKNPGG WWEGELQARG KKRQIGWFPA NYVKLLSPGT SKITPTEPPK
1151 STALAAVCQV IGMYDYTAQN DDELAFNKGQ IINVLNKEDP DWWKGEVNGQ
1201 VGLFPSNYVK LTTDMDPSQQ *
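
The relation between the #21 nucleotide rows and the translated protein rows can be checked with a short script. The following is a minimal sketch, assuming the standard genetic code (NCBI translation table 1); the helper names parse_numbered_blocks and translate_from are hypothetical and are not part of the listing. It strips the position numbers and block spacing from one row, then translates from the first ATG in that row.

```python
import re

# Standard genetic code (NCBI translation table 1), codons ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def parse_numbered_blocks(text):
    """Join a numbered, space-separated listing into one uppercase string.

    Hypothetical helper: only A, C, G and T are kept, so position numbers,
    spaces and any other stray characters are dropped silently.
    """
    return "".join(re.findall(r"[ACGT]", text.upper()))


def translate_from(dna, start):
    """Translate codon by codon from index `start` until a stop codon or
    the end of the sequence; the stop itself is not emitted."""
    residues = []
    for i in range(start, len(dna) - 2, 3):
        residue = CODON_TABLE[dna[i:i + 3]]
        if residue == "*":
            break
        residues.append(residue)
    return "".join(residues)


# Row 251 of the #21 nucleotide listing.
row = "251 TAAAAGTAAC AGAACCATGG CTCAGTTTCC AACACCTTTT GGTGGCAGCC"
dna = parse_numbered_blocks(row)
print(translate_from(dna, dna.find("ATG")))  # MAQFPTPFGGS
```

The output matches the first eleven residues of the #21 translated protein. On the full cDNA the first ATG is not necessarily the start codon, so the start index is passed in explicitly instead of being searched for inside the function.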

Whole protein sequence

1 TRGSEGGREE WRRQGRERSL VAP*YGGSRG RIPSGLRDGQ RGGRGWCAGL
51 RLLRPSQRRV SGTDLSLGRQ RGPARR*GVD *QGKSNRTMA QFPTPFGGSL
101 DIWAITVEER AKHDQQFHSL KPISGFITGD QARNFFFQSG LPQPVLAQIW 
151 ALADMNNDGR MDQVEFSIAM KLIKLKLQGY QLPSALPPVM KQQPVAISSA 
201 PAFGMGGIAS MPPLTAVAPV PMGSIPVVGM SPTLVSSVPT AAVPPLANGA 
251 PPVIQPLPAF AHPAATLPKS SSFSRSGPGS QLNTKLQKAQ SFDVASVPPV 
301 AEWAVPQSSR LKYRQLFNSH DKTMSGHLTG PQARTILMQS SLPQAQLASI 
351 WNLSDIDQDG KLTAEEFILA MHLIDVAMSG QPLPPVLPPE YIPPSFRRVR
401 SGSGISVISS TSVDQRLPEE PVLEDEQQQL EKKLPVTFED KKRENFERGN 
451 LELEKRRQAL LEQQRKEQER LAQLERAEQE RKERERQEQE RKRQLELEKQ 
501 LEKQRELERQ REEERRKEIE RREAAKRELE RQRQLEWERN RRQELLNQRN
551 KEQEDIVVLK AKKKTLEFEL EALNDKKHQL EGKLQDIRCR LTTQRQEIES
601 TNKSRELRIA EITHLQQQLQ ESQQMLGRLI PEKQILNDQL KQVQQNSLHR 
651 DSLVTLKRAL EAKELARQHL RDQLDEVEKE TRSKLQEIDI FNNQLKELRE 
701 IHNKQQLQKQ KSMEAERLKQ KEQERKIIEL EKQKEEAQRR AQERDKQWLE 
751 HVQQEDEHQR PRKLHEEEKL KREESVKKKD GEEKGKQEAQ DKLGRLFHQH
801 QEPAKPAVQA PWSTAEKGPL TISAQENVKV VYYRALYPFE SRSHDEITIQ
851 PGDIVMVKGE WVDESQTGEP GWLGGELKGK TGWFPANYAE KIPENEVPAP
901 VKPVTDSTSA PAPKLALRET PAPLAVTSSE PSTTPNNWAD FSSTWPTSTN
951 EKPETDNWDA WAAQPSLTVP SAGQLRQRSA FTPATATGSS PSPVLGQGEK
1001 VEGLQAQALY PWRAKKDNHL NFNKNDVITV LEQQDMWWFG EVQGQKGWFP
1051 KSYVKLISGP IRKSTSMDSG SSESPASLKR VASPAAKPVV SGEEFIAMYT
1101 YESSEQGDLT FQQGDVILVT KKDGDWWTGT VGDKAGVFPS NYVRLKDSEG
1151 SGTAGKTGSL GKKPEIAQVI ASYTATGPEQ LTLAPGQLIL IRKKNPGGWW
1201 EGELQARGKK RQIGWFPANY VKLLSPGTSK ITPTEPPKST ALAAVCQVIG
1251 MYDYTAQNDD ELAFNKGQII NVLNKEDPDW WKGEVNGQVG LFPSNYVKLT
1301 TDMDPSQQ*I ICCPSPPQA* KSSKRPTIPY HCPEG*WEMQ P*SCDFQHDH
1351 LLPSE*KNSL QSSLPHFTLV ACDRNV*VIT CRDRSKNYKN TQGSGSFCGF
1401 PSYSN*LSPT FAQVLSIVLK LFLNIYFSFL INKINK*LLC YFGFAKRPTI
1451 KECCMCY*KL FQMSINLRLD VFFHFVQCYQ LNCAVWGFSP LP*KCRGVQY
1501 LCFKDV*NEP N*SEGVCACL CVSAVPC*AC NTSCT*EISS FHGKAITLYD
1551 ALIILHLILF CTVTL*PHEK ALCVFVRSQI YLVELVFCLG FLILRVCIA*
1601 NQ*TTPLRSL RSTISTVSF* SLLHEVLFQL LFME*PILNK *FS*QERMYR 
1651 SLPAINFQCL HFLTRLWNFY RLI*NGAHGP FVC*ICCS*S PVCLLNTSWK 
1701 LSIKMPAAHS TENGAGGASS TI*LSS*RLC NAHSPRVLPA LSGGCAGGRV 
1751 EVLLLSHGAE SEDLSSSFSC TSVFSRI*M* NI*IYKPAAL TTVIQPFELV 
1801 PCIDN*ILHT KVKKKKKK 
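
The rows above read like a single-frame translation of the whole #21 cDNA in which stop codons are printed as * instead of ending the read. The following is a minimal sketch of that kind of read-through translation, again assuming the standard genetic code; the function name translate_frame and its offset parameter are hypothetical. Codons containing anything other than A, C, G or T are written as X.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): residue
    for codon, residue in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate_frame(dna, offset=0):
    """Translate every complete codon starting at `offset`.

    Stop codons are written as '*' and translation continues past them;
    codons with characters outside A, C, G, T are written as 'X'.
    """
    dna = dna.upper()
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X")
        for i in range(offset, len(dna) - 2, 3)
    )


# First 32 nucleotides of the #21 listing, read from the third base.
print(translate_frame("GCACGAGAGGGAGCGAAGGAGGTAGAGAAGAG", offset=2))  # TRGSEGGREE
```

With an offset of 2 the output reproduces the first ten residues of the whole protein sequence, which suggests the listing was read in the third forward frame of the #21 cDNA.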
1 AGAGTGGAGG CGCCAGGGGA GGGAGCGTAG CTTGGTTGCT CCGTAGTACG
51 GCGGCTCGCG AGGAAGAATC CCGAGCGGGC TCCGGGACGG ACAGAGAGGC
101 GGGCGGGGAT GGTGTGCGGG GCTGCGGCTC CTGCGTCCCT CCCAGCGGCG
151 CGTGAGCGGC ACTGATTTGT CCCTGGGGCG GCAGCGCGGA CCCGCCCGGA 
201 GATGAGGCGT CGATTAGCAA GGTAAAAGTA ACAGAACCAT GGCTCAGTTT 
251 CCAACACCTT TTGGTGGCAG CCTGGATATC TGGGCCATAA CTGTAGAGGA
301 AAGAGCGAAG CATGATCAGC AGTTCCATAG TTTAAAGCCA ATATCTGGAT
351 TCATTACTGG TGATCAAGCT AGAAACTTTT TTTTTCAATC TGGGTTACCT
401 CAACCTGTTT TAGCACAGAT ATGGGCACTA GCTGACATGA ATAATGATGG
451 AAGAATGGAT CAAGTGGAGT TTTCCATAGC TATGAAACTT ATCAAACTGA
501 AGCTACAAGG ATATCAGCTA CCCTCTGCAC TTCCCCCTGT CATGAAACAG
551 CAACCAGTTG CTATTTCTAG CGCACCAGCA TTTGGTATGG GAGGTATCGC
601 CAGCATGCCA CCGCTTACAG CTGTTGCTCC AGTGCCAATG GGATCCATTC
651 CAGTTGTTGG AATGTCTCCA ACCCTAGTAT CTTCTGTTCC CACAGCAGCT
701 GTGCCCCCCC TGGCTAACGG GGCTCCCCCT GTTATACAAC CTCTGCCTGC
751 ATTTGCTCAT CCTGCAGCCA CATTGCCAAA GAGTTCTTCC TTTAGTAGAT
801 CTGGTCCAGG GTCACAACTA AACACTAAAT TACAAAAGGC ACAGTCATTT
851 GATGTGGCCA GTGTCCCACC AGTGGCAGAG TGGGCTGTTC CTCAGTCATC
901 AAGACTGAAA TACAGGCAAT TATTCAATAG TCATGACAAA ACTATGAGTG
951 GACACTTAAC AGGTCCCCAA GCAAGAACTA TTCTTATGCA GTCAAGTTTA
1001 CCACAGGCTC AGCTGGCTTC AATATGGAAT CTTTCTGACA TTGATCAAGA
1051 TGGAAAACTT ACAGCAGAGG AATTTATCCT GGCAATGCAC CTCATTGATG
1101 TAGCTATGTC TGGCCAACCA CTGCCACCTG TCCTGCCTCC AGAATACATT 
1151 CCACCTTCTT TTAGAAGAGT TCGATCTGGC AGTGGTATAT CTGTCATAAG 
1201 CTCAACATCT GTAGATCAGA GGCTACGAGA GGAACCAGTT TTAGAAGATG 
1251 AACAACAACA ATTAGAAAAG AAATTACCTG TAACGTTTGA AGATAAGAAG
1301 CGGGAGAACT TTGAACGTGG CAACCTGGAA CTGGAGAAAC GAAGGCAAGC
1351 TCTCCTGGAA CAGCAGCGCA AGGAGCAGGA GCGCCTGGCC CAGCTGGAGC
1401 GGGCGGAGCA GGAGAGGAAG GAGCGTGAGC GCCAGGAGCA AGAGCGCAAA
1451 AGACAACTGG AACTGGAGAA GCAACTGGAA AAGCAGCGGG AGCTAGAACG 
1501 GCAGAGAGAG GAGGAGAGGA GGAAAGAAAT TGAGAGGCGA GAGGCTGCAA 
1551 AACGGGAACT TGAAAGGCAA CGACAACTTG AGTGGGAACG GAATCGAAGG 
1601 CAAGAACTAC TAAATCAAAG AAACAAAGAA CAAGAGGACA TAGTTGTACT 
1651 GAAAGCAAAG AAAAAGACTT TGGAATTTGA ATTAGAAGCT CTAAATGATA
1701 AAAAGCATCA ACTAGAAGGG AAACTTCAAG ATATCAGATG TCGATTGACC 
1751 ACCCAAAGGC AAGAAATTGA GAGCACAAAC AAATCTAGAG AGTTGAGAAT 
1801 TGCCGAAATC ACCCATCTAC AGCAACAATT ACAGGAATCT CAGCAAATGC 
1851 TTGGAAGACT TATTCCAGAA AAACAGATAC TCAATGACCA ATTAAAACAA 
1901 GTTCAGCAGA ACAGTTTGCA CAGAGATTCA CTTGTTACAC TTAAAAGAGC 
1951 CTTAGAAGCA AAAGAACTAG CTCGGCAGCA CCTACGAGAC CAACTGGATG
2001 AAGTGGAGAA AGAAACTAGA TCAAAACTAC AGGAGATTGA TATTTTCAAT
2051 AATCAGCTGA AGGAACTAAG AGAAATACAC AATAAGCAAC AACTCCAGAA
2101 GCAAAAGTCC ATGGAGGCTG AACGACTGAA ACAGAAAGAA CAAGAACGAA
2151 AGATCATAGA ATTAGAAAAA CAAAAAGAAG AAGCCCAAAG ACGAGCTCAG 
2201 GAAAGGGACA AGCAGTGGCT GGAGCATGTG CAGCAGGAGG ACGAGCATCA 
2251 GAGACCAAGA AAACTCCACG AAGAGGAAAA ACTGAAAAGG GAGGAGAGTG
2301 TCAAAAAGAA GGATGGCGAG GAAAAAGGCA AACAGGAAGC ACAAGACAAG
2351 CTGGGTCGGC TTTTCCATCA ACACCAAGAA CCAGCTAAGC CAGCTGTCCA
2401 GGCACCCTGG TCCACTGCAG AAAAAGGTCC ACTTACCATT TCTGCACAGG
2451 AAAATGTAAA AGTGGTGTAT TACCGGGCAC TGTACCCCTT TGAATCCAGA
2501 AGCCATGATG AAATCACTAT CCAGCCAGGA GACATAGTCA TGGTGGATGA 
2551 AAGCCAAACT GGAGAACCCG GCTGGCTTGG AGGAGAATTA AAAGGAAAGA 
2601 CAGGGTGGTT CCCTGCAAAC TATGCAGAGA AAATCCCAGA AAATGAGGTT 
2651 CCCGCTCCAG TGAAACCAGT GACTGATTCA ACATCTGCCC CTGCCCCCAA
2701 ACTGGCCTTG CGTGAGACCC CCGCCCCTTT GGCAGTAACC TCTTCAGAGC
2751 CCTCCACGAC CCCTAATAAC TGGGCCGACT TCAGCTCCAC GTGGCCCACC
2801 AGCACGAATG AGAAACCAGA AACGGATAAC TGGGATGCAT GGGCAGCCCA 
2851 GCCCTCTCTC ACCGTTCCAA GTGCCGGCCA GTTAAGGCAG AGGTCCGCCT 
2901 TTACTCCAGC CACGGCCACT GGCTCCTCCC CGTCTCCTGT GCTAGGCCAG 
2951 GGTGAAAAGG TGGAGGGGCT ACAAGCTCAA GCCCTATATC CTTGGAGAGC
3001 CAAAAAAGAC AACCACTTAA ATTTTAACAA AAATGATGTC ATCACCGTCC 
3051 TGGAACAGCA AGACATGTGG TGGTTTGGAG AAGTTCAAGG TCAGAAGGGT 
3101 TGGTTCCCCA AGTCTTACGT GAAACTCATT TCAGGGCCCA TAAGGAAGTC 
3151 TACAAGCATG GATTCTGGTT CTTCAGAGAG TCCTGCTAGT CTAAAGCGAG
3201 TAGCCTCTCC AGCAGCCAAG CCGGTCGTTT CGGGAGAAGA ATTTATTGCC
3251 ATGTACACTT ACGAGAGTTC TGAGCAAGGA GATTTAACCT TTCAGCAAGG
3301 GGATGTGATT TTGGTTACCA AGAAAGATGG TGACTGGTGG ACAGGAACAG 
3351 TGGGCGACAA GGCCGGAGTC TTCCCTTCTA ACTATGTGAG GCTTAAAGAT 
3401 TCAGAGGGCT CTGGAACTGC TGGGAAAACA GGGAGTTTAG GAAAAAAACC 
3451 TGAAATTGCC CAGGTTATTG CCTCATACAC CGCCACCGGC CCCGAGCAGC 
3501 TCACTCTCGC CCCTGGTCAG CTGATTTTGA TCCGAAAAAA GAACCCAGGT 
3551 GGATGGTGGG AAGGAGAGCT GCAAGCACGT GGGAAAAAGC GCCAGATAGG 
3601 CTGGTTCCCA GCTAATTATG TAAAGCTTCT AAGCCCTGGG ACGAGCAAAA 
3651 TCACTCCAAC AGAGCCACCT AAGTCAACAG CATTAGCGGC AGTGTGCCAG 
3701 GTGATTGGGA TGTACGACTA CACCGCGCAG AATGACGATG AGCTGGCCTT 
3751 CAACAAGGGC CAGATCATCA ACGTCCTCAA CAAGGAGGAC CCTGACTGGT 
3801 GGAAAGGAGA AGTCAATGGA CAAGTGGGGC TCTTCCCATC CAATTATGTG 
3851 AAGCTGACCA CAGACATGGA CCCAAGCCAG CAATGAATCA TATGTTGTCC 
3901 ATCCCCCCCT CAGGCTTGAA AGTCCTTTTG TGGCTTTCCT AGTTACTCAA 
3951 ATTGACTTTC CCCCACCTTT GCACAGGTGC TTTCAATAGT TTTAAAATTA 
4001 TTTTTAAATA TATATTTTAG CTTTTTAATA AACAAAATAA ATAAATGACT 
4051 TCTTTGCTAT TTTGGTTTTG CAAAAAGACC CACTATCAAG GAATGCTGCA 
4101 TGTGCTATTA AAAATTGTTC CAAATGTCCA TAAATCTGAG ACTTGATGTA 
4151 TTTTTTCATT TTGTCCAGTG TTACCAACTA AATTGTGCAG TTTGGGGCTT
4201 TTCCCCCTTA CCATAGAAGT GCAGAGGAGT TCAGTATCTC TGTTTTAAAG
4251 ACGTATAGAA TGAGCCCAAT TAAAGCGAAG GTGTTTGTGC TTGTTTGTGT
4301 GTATCAGCTG TACCTTGTTG AGCATGTAAT ACATCCTGTA CATAAGAAAT 
4351 TAGTTCTTTC CATGGCAAAG CTATTACCTT GTACGATGCT CTAATCATAT
4401 TGCATTTAAT TTTATTTTGC ACAGTGACCT TGTAGCCACA TGAGAAAGCA
4451 CTCTGTGTTT TTGTTCGGTC TCAGATTTAT CTGGTTGAGT TGGTGTTTTG
4501 TTTGGGGTTT TTAATTTTGC GTGTTTGCAT AGCATAAAAT CAGTAGACAA
4551 CACCACTGAG GTCGTTACGA TCAACGATAT CCACAGTCTC TTTTTAGTCT
4601 CTGTTACATG AAGTTTTATT CCAGTTACTT TTCATGGAAT GACCTATTTT
4651 GAACAAGTAA TTTTCTTGAC AAGAAAGAAT GTATAGAAGT CTCCCTGCAA
4701 TTAATTTCCA ATGTTTACAT TTTTTAACTA GACTGTGGAA TTTCTACAGA
4751 TTAATATGAA ATGGAGCTCA TGGTCCGTTT GTGTGTTAGA TATGCTGTAG
4801 CTGAAGCCCT GTTTGTCTTT TAAACACTAG TTGGAAGCTC TCAATAAAAA
4851 TGCCTGCTGC TCACAGCACA GAAAATGGGG CAGGGGGAGC CTCAAGCACA
4901 ATCTAGCTGT CCTCCTAAAG ACTCTGTAAT GCTCACTCCC CTCGCGTTCT
4951 CCCGGCGCTG TCGGGAGGCT GTGCTGGTGG TCGTGTAGAG GTCCTTCTCC
5001 TTTCACATGG TGCAGAGAGC GAGGACCTCT CCTCCTCGTT CAGTTGCACT
5051 TCAGTATTTT CACGGATATG AATGTAAAAT ATATAAATAT ATAAACCTGC
5101 GGCTTTAACA ACTGTAATAC AACCTTTTGA ATTAGTTCCG TGTATAGATA
5151 ATTAAATTCT TCATACAAAA GTTAAAAAAA AAAAAAAAAA AAAAA
Translated Protein Sequence a 1 1

1 MAQFPTPFGG SLDIWAITVE ERAKHDQQFH SLKPISGFIT GDQARNFFFQ
51 SGLPQPVLAQ IWALADMNND GRMDQVEFSI AMKLIKLKLQ GYQLPSALPP
101 VMKQQPVAIS SAPAFGMGGI ASMPPLTAVA PVPMGSIPVV GMSPTLVSSV
151 PTAAVPPLAN GAPPVIQPLP AFAHPAATLP KSSSFSRSGP GSQLNTKLQK
201 AQSFDVASVP PVAEWAVPQS SRLKYRQLFN SHDKTMSGHL TGPQARTILM
251 QSSLPQAQLA SIWNLSDIDQ DGKLTAEEFI LAMHLIDVAM SGQPLPPVLP
301 PEYIPPSFRR VRSGSGISVI SSTSVDQRLP EEPVLEDEQQ QLEKKLPVTF
351 EDKKRENFER GNLELEKRRQ ALLEQQRKEQ ERLAQLERAE QERKERERQE
401 QERKRQLELE KQLEKQRELE RQREEERRKE IERREAAKRE LERQRQLEWE
451 RNRRQELLNQ RNKEQEDIVV LKAKKKTLEF ELEALNDKKH QLEGKLQDIR
501 CRLTTQRQEI ESTNKSRELR IAEITHLQQQ LQESQQMLGR LIPEKQILND
551 QLKQVQQNSL HRDSLVTLKR ALEAKELARQ HLRDQLDEVE KETRSKLQEI
601 DIFNNQLKEL REIHNKQQLQ KQKSMEAERL KQKEQERKII ELEKQKEEAQ
651 RRAQERDKQW LEHVQQEDEH QRPRKLHEEE KLKREESVKK KDGEEKGKQE
701 AQDKLGRLFH QHQEPAKPAV QAPWSTAEKG PLTISAQENV KVVYYRALYP
751 FESRSHDEIT IQPGDIVMVD ESQTGEPGWL GGELKGKTGW FPANYAEKIP
801 ENEVPAPVKP VTDSTSAPAP KLALRETPAP LAVTSSEPST TPNNWADFSS
851 TWPTSTNEKP ETDNWDAWAA QPSLTVPSAG QLRQRSAFTP ATATGSSPSP
901 VLGQGEKVEG LQAQALYPWR AKKDNHLNFN KNDVITVLEQ QDMWWFGEVQ
951 GQKGWFPKSY VKLISGPIRK STSMDSGSSE SPASLKRVAS PAAKPVVSGE
1001 EFIAMYTYES SEQGDLTFQQ GDVILVTKKD GDWWTGTVGD KAGVFPSNYV
1051 RLKDSEGSGT AGKTGSLGKK PEIAQVIASY TATGPEQLTL APGQLILIRK
1101 KNPGGWWEGE LQARGKKRQI GWFPANYVKL LSPGTSKITP TEPPKSTALA
1151 AVCQVIGMYD YTAQNDDELA FNKGQIINVL NKEDPDWWKG EVNGQVGLFP
1201 SNYVKLTTDM DPSQQ*






whole protein sequence: 

1 EWRRQGRERS LVAP*YGGSR GRIPSGLRDG QRGGRGWCAG LRLLRPSQRR
51 VSGTDLSLGR QRGPARR*GV D*QGKSNRTM AQFPTPFGGS LDIWAITVEE
101 RAKHDQQFHS LKPISGFITG DQARNFFFQS GLPQPVLAQI WALADMNNDG
151 RMDQVEFSIA MKLIKLKLQG YQLPSALPPV MKQQPVAISS APAFGMGGIA
201 SMPPLTAVAP VPMGSIPVVG MSPTLVSSVP TAAVPPLANG APPVIQPLPA
251 FAHPAATLPK SSSFSRSGPG SQLNTKLQKA QSFDVASVPP VAEWAVPQSS
301 RLKYRQLFNS HDKTMSGHLT GPQARTILMQ SSLPQAQLAS IWNLSDIDQD
351 GKLTAEEFIL AMHLIDVAMS GQPLPPVLPP EYIPPSFRRV RSGSGISVIS
401 STSVDQRLPE EPVLEDEQQQ LEKKLPVTFE DKKRENFERG NLELEKRRQA
451 LLEQQRKEQE RLAQLERAEQ ERKERERQEQ ERKRQLELEK QLEKQRELER
501 QREEERRKEI ERREAAKREL ERQRQLEWER NRRQELLNQR NKEQEDIVVL
551 KAKKKTLEFE LEALNDKKHQ LEGKLQDIRC RLTTQRQEIE STNKSRELRI
601 AEITHLQQQL QESQQMLGRL IPEKQILNDQ LKQVQQNSLH RDSLVTLKRA
651 LEAKELARQH LRDQLDEVEK ETRSKLQEID IFNNQLKELR EIHNKQQLQK
701 QKSMEAERLK QKEQERKIIE LEKQKEEAQR RAQERDKQWL EHVQQEDEHQ
751 RPRKLHEEEK LKREESVKKK DGEEKGKQEA QDKLGRLFHQ HQEPAKPAVQ
801 APWSTAEKGP LTISAQENVK VVYYRALYPF ESRSHDEITI QPGDIVMVDE
851 SQTGEPGWLG GELKGKTGWF PANYAEKIPE NEVPAPVKPV TDSTSAPAPK
901 LALRETPAPL AVTSSEPSTT PNNWADFSST WPTSTNEKPE TDNWDAWAAQ
951 PSLTVPSAGQ LRQRSAFTPA TATGSSPSPV LGQGEKVEGL QAQALYPWRA
1001 KKDNHLNFNK NDVITVLEQQ DMWWFGEVQG QKGWFPKSYV KLISGPIRKS
1051 TSMDSGSSES PASLKRVASP AAKPVVSGEE FIAMYTYESS EQGDLTFQQG
1101 DVILVTKKDG DWWTGTVGDK AGVFPSNYVR LKDSEGSGTA GKTGSLGKKP
1151 EIAQVIASYT ATGPEQLTLA PGQLILIRKK NPGGWWEGEL QARGKKRQIG
1201 WFPANYVKLL SPGTSKITPT EPPKSTALAA VCQVIGMYDY TAQNDDELAF
1251 NKGQIINVLN KEDPDWWKGE VNGQVGLFPS NYVKLTTDMD PSQQ*IICCP
1301 SPPQA*KSFC GFPSYSN*LS PTFAQVLSIV LKLFLNIYFS FLINKINK*L
1351 LCYFGFAKRP TIKECCMCY* KLFQMSINLR LDVFFHFVQC YQLNCAVWGF
1401 SPLP*KCRGV QYLCFKDV*N EPN*SEGVCA CLCVSAVPC* ACNTSCT*EI
1451 SSFHGKAITL YDALIILHLI LFCTVTL*PH EKALCVFVRS QIYLVELVFC
1501 LGFLILRVCI A*NQ*TTPLR SLRSTISTVS F*SLLHEVLF QLLFME*PIL
1551 NK*FS*QERM YRSLPAINFQ CLHFLTRLWN FYRLI*NGAH GPFVC*ICCS
1601 *SPVCLLNTS WKLSIKMPAA HSTENGAGGA SSTI*LSS*R LCNAHSPRVL
1651 PALSGGCAGG RVEVLLLSHG AESEDLSSSF SCTSVFSRI* M*NI*IYKPA
1701 ALTTVIQPFE LVPCIDN*IL HTKVKKKKKK K
1 CGGGGATGGT GTGCGGGGCT GCGGCTCCTG CGTCCCTCCC AGCGGCGCGT
51 GAGCGGCACT GATTTGTCCC TGGGGCGGCA GCGCGGACCC GCCCGGAGAT
101 GAGGCGTCGA TTAGCAAGGT AAAAGTAACA GAACCATGGC TCAGTTTCCA
151 ACACCTTTTG GTGGCAGCCT GGATATCTGG GCCATAACTG TAGAGGAAAG
201 AGCGAAGCAT GATCAGCAGT TCCATAGTTT AAAGCCAATA TCTGGATTCA
251 TTACTGGTGA TCAAGCTAGA AACTTTTTTT TTCAATCTGG GTTACCTCAA
301 CCTGTTTTAG CACAGATATG GGCACTAGCT GACATGAATA ATGATGGAAG 
351 AATGGATCAA GTGGAGTTTT CCATAGCTAT GAAACTTATC AAACTGAAGC
401 TACAAGGATA TCAGCTACCC TCTGCACTTC CCCCTGTCAT GAAACAGCAA
451 CCAGTTGCTA TTTCTAGCGC ACCAGCATTT GGTATGGGAG GTATCGCCAG
501 CATGCCACCG CTTACAGCTG TTGCTCCAGT GCCAATGGGA TCCATTCCAG
551 TTGTTGGAAT GTCTCCAACC CTAGTATCTT CTGTTCCCAC AGCAGCTGTG
601 CCCCCCCTGG CTAACGGGGC TCCCCCTGTT ATACAACCTC TGCCTGCATT
651 TGCTCATCCT GCAGCCACAT TGCCAAAGAG TTCTTCCTTT AGTAGATCTG
701 GTCCAGGGTC ACAACTAAAC ACTAAATTAC AAAAGGCACA GTCATTTGAT
751 GTGGCCAGTG TCCCACCAGT GGCAGAGTGG GCTGTTCCTC AGTCATCAAG
801 ACTGAAATAC AGGCAATTAT TCAATAGTCA TGACAAAACT ATGAGTGGAC 
851 ACTTAACAGG TCCCCAAGCA AGAACTATTC TTATGCAGTC AAGTTTACCA 
901 CAGGCTCAGC TGGCTTCAAT ATGGAATCTT TCTGACATTG ATCAAGATGG 
951 AAAACTTACA GCAGAGGAAT TTATCCTGGC AATGCACCTC ATTGATGTAG
1001 CTATGTCTGG CCAACCACTG CCACCTGTCC TGCCTCCAGA ATACATTCCA
1051 CCTTCTTTTA GAAGAGTTCG ATCTGGCAGT GGTATATCTG TCATAAGCTC
1101 AACATCTGTA GATCAGAGGC TACCAGAGGA ACCAGTTTTA GAAGATGAAC
1151 AACAACAATT AGAAAAGAAA TTACCTGTAA CGTTTGAAGA TAAGAAGCGG 
1201 GAGAACTTTG AACGTGGCAA CCTGGAACTG GAGAAACGAA GGCAAGCTCT 
1251 CCTGGAACAG CAGCGCAAGG AGCAGGAGCG CCTGGCCCAG CTGGAGCGGG 
1301 CGGAGCAGGA GAGGAAGGAG CGTGAGCGCC AGGAGCAAGA GCGCAAAAGA 
1351 CAACTGGAAC TGGAGAAGCA ACTGGAAAAG CAGCGGGAGC TAGAACGGCA 
1401 GAGAGAGGAG GAGAGGAGGA AAGAAATTGA GAGGCGAGAG GCTGCAAAAC 
1451 GGGAACTTGA AAGGCAACGA CAACTTGAGT GGGAACGGAA TCGAAGGCAA
1501 GAACTACTAA ATCAAAGAAA CAAAGAACAA GAGGACATAG TTGTACTGAA
1551 AGCAAAGAAA AAGACTTTGG AATTTGAATT AGAAGCTCTA AATGATAAAA
1601 AGCATCAACT AGAAGGGAAA CTTCAAGATA TCAGATGTCG ATTGACCACC 
1651 CAAAGGCAAG AAATTGAGAG CACAAACAAA TCTAGAGAGT TGAGAATTGC 
1701 CGAAATCACC CATCTACAGC AACAATTACA GGAATCTCAG CAAATGCTTG 
1751 GAAGACTTAT TCCAGAAAAA CAGATACTCA ATGACCAATT AAAACAAGTT
1801 CAGCAGAACA GTTTGCACAG AGATTCACTT GTTACACTTA AAAGAGCCTT
1851 AGAAGCAAAA GAACTAGCTC GGCAGCACCT ACGAGACCAA CTGGATGAAG
1901 TGGAGAAAGA AACTAGATCA AAACTACAGG AGATTGATAT TTTCAATAAT
1951 CAGCTGAAGG AACTAAGAGA AATACACAAT AAGCAACAAC TCCAGAAGCA
2001 AAAGTCCATG GAGGCTGAAC GACTGAAACA GAAAGAACAA GAACGAAAGA 
2051 TCATAGAATT AGAAAAAAAA AAAAAAAAA 
#5 translated Protein sequence:



1 MAQFPTPFGG SLDIWAITVE ERAKHDQQFH SLKPISGFIT GDQARNFFFQ
51 SGLPQPVLAQ IWALADMNND GRMDQVEFSI AMKLIKLKLQ GYQLPSALPP
101 VMKQQPVAIS SAPAFGMGGI ASMPPLTAVA PVPMGSIPVV GMSPTLVSSV 
151 PTAAVPPLAN GAPPVIQPLP AFAHPAATLP KSSSFSRSGP GSQLNTKLQK 
201 AQSFDVASVP PVAEWAVPQS SRLKYRQLFN SHDKTMSGHL TGPQARTILM 
251 QSSLPQAQLA SIWNLSDIDQ DGKLTAEEFI LAMHLIDVAM SGQPLPPVLP 
301 PEYIPPSFRR VRSGSGISVI SSTSVDQRLP EEPVLEDEQQ QLEKKLPVTF 
351 EDKKRENFER GNLELEKRRQ ALLEQQRKEQ ERLAQLERAE QERKERERQE
401 QERKRQLELE KQLEKQRELE RQREEERRKE IERREAAKRE LERQRQLEWE
451 RNRRQELLNQ RNKEQEDIVV LKAKKKTLEF ELEALNDKKH QLEGKLQDIR
501 CRLTTQRQEI ESTNKSRELR IAEITHLQQQ LQESQQMLGR LIPEKQILND
551 QLKQVQQNSL HRDSLVTLKR ALEAKELARQ HLRDQLDEVE KETRSKLQEI
601 DIFNNQLKEL REIHNKQQLQ KQKSMEAERL KQKEQERKII ELEKKKKK



1 RGWCAGLRLL RPSQRRVSGT DLSLGRQRGP ARR*GVD*QG KSNRTMAQFP
51 TPFGGSLDIW AITVEERAKH DQQFHSLKPI SGFITGDQAR NFFFQSGLPQ
101 PVLAQIWALA DMNNDGRMDQ VEFSIAMKLI KLKLQGYQLP SALPPVMKQQ
151 PVAISSAPAF GMGGIASMPP LTAVAPVPMG SIPVVGMSPT LVSSVPTAAV
201 PPLANGAPPV IQPLPAFAHP AATLPKSSSF SRSGPGSQLN TKLQKAQSFD
251 VASVPPVAEW AVPQSSRLKY RQLFNSHDKT MSGHLTGPQA RTILMQSSLP 
301 QAQLASIWNL SDIDQDGKLT AEEFILAMHL IDVAMSGQPL PPVLPPEYIP 
351 PSFRRVRSGS GISVISSTSV DQRLPEEPVL EDEQQQLEKK LPVTFEDKKR 
401 ENFERGNLEL EKRRQALLEQ QRKEQERLAQ LERAEQERKE RERQEQERKR 
451 QLELEKQLEK QRELERQREE ERRKEIERRE AAKRELERQR QLEWERNRRQ
501 ELLNQRNKEQ EDIVVLKAKK KTLEFELEAL NDKKHQLEGK LQDIRCRLTT
551 QRQEIESTNK SRELRIAEIT HLQQQLQESQ QMLGRLIPEK QILNDQLKQV 
601 QQNSLHRDSL VTLKRALEAK ELARQHLRDQ LDEVEKETRS KLQEIDIFNN 
651 QLKELREIHN KQQLQKQKSM EAERLKQKEQ ERKIIELEKK KKK

1 GACCACCCAA AGGCAAGAAA TTGAGAGCAC AAACAAATCT AGAGAGTTGA
51 GAATTGCCGA AATCACCCAT CTACAGCAAC AATTACAGGA ATCTCAGCAA
101 ATGCTTGGAA GACTTATTCC AGAAAAACAG ATACTCAATG ACCAATTAAA 
151 ACAAGTTCAG CAGAACAGTT TGCACAGAGA TTCACTTGTT ACACTTAAAA
201 GAGCCTTAGA AGCAAAAGAA CTAGCTCGGC AGCACCTACG AGACCAACTG 
251 GATGAAGTGG AGAAAGAAAC TAGATCAAAA CTACAGGAGA TTGATATTTT 
301 CAATAATCAG CTGAAGGAAC TAAGAGAAAT ACACAATAAG CAACAACTCC 
351 AGAAGCAAAA GTCCATGGAG GCTGAACGAC TGAAACAGAA AGAACAAGAA
401 CGAAAGATCA TAGAATTAGA AAAACAAAAA GAAGAAGCCC AAAGACGAGC 
451 TCAGGAAAGG GACAAGCAGT GGCTGGAGCA TGTGCAGCAG GAGGACGAGC 
501 ATCAGAGACC AAGAAAACTC CACGAAGAGG AAAAACTGAA AAGGGAGGAG 
551 AGTGTCAAAA AGAAGGATGG CGAGGAAAAA GGCAAACAGG AAGCACAAGA 
601 CAAGCTGGGT CGGCTTTTCC ATCAACACCA AGAACCAGCT AAGCCAGCTG
651 TCCAGGCACC CTGGTCCACT GCAGAAAAAG GTCCACTTAC CATTTCTGCA 
701 CAGGAAAATG TAAAAGTGGT GTATTACCGG GCACTGTACC CCTTTGAATC 
751 CAGAAGCCAT GATGAAATCA CTATCCAGCC AGGAGACATA GTCATGGTGG
801 ATGAAAGCCA AACTGGAGAA CCCGGCTGGC TTGGAGGAGA ATTAAAAGGA
851 AAGACAGGGT GGTTCCCTGC AAACTATGCA GAGAAAATCC CAGAAAATGA
901 GGTTCCCGCT CCAGTGAAAC CAGTGACTGA TTCAACATCT GCCCCTGCCC
951 CCAAACTGGC CTTGCGTGAG ACCCCCGCCC CTTTGGCAGT AACCTCTTCA
1001 GAGCCCTCCA CGACCCCTAA TAACTGGGCC GACTTCAGCT CCACGTGGCC
1051 CACCAGCACG AATGAGAAAC CAGAAACGGA TAACTGGGAT GCATGGGCAG
1101 CCCAGCCCTC TCTCACCGTT CCAAGTGCCG GCCAGTTAAG GCAGAGGTCC
1151 GCCTTTACTC CAGCCACGGC CACTGGCTCC TCCCCGTCTC CTGTGCTAGG
1201 CCAGGGTGAA AAGGTGGAGG GGCTACAAGC TCAAGCCCTA TATCCTTGGA
1251 GAGCCAAAAA AGACAACCAC TTAAATTTTA ACAAAAATGA TGTCATCACC
1301 GTCCTGGAAC AGCAAGACAT GTGGTGGTTT GGAGAAGTTC AAGGTCAGAA 
1351 GGGTTGGTTC CCCAAGTCTT ACGTGAAACT CATTTCAGGG CCCATAAGGA 
1401 AGTCTACAAG CATGGATTCT GGTTCTTCAG AGAGTCCTGC TAGTCTAAAG 
1451 CGAGTAGCCT CTCCAGCAGC CAAGCCGGTC GTTTCGGGAG AAGAAATTGC 
1501 CCAGGTTATT GCCTCATACA CCGCCACCGG CCCCGAGCAG CTCACTCTCG 
1551 CCCCTGGTCA GCTGATTTTG ATCCGAAAAA AGAACCCAGG TGGATGGTGG
1601 GAAGGAGAGC TGCAAGCACG TGGGAAAAAG CGCCAGATAG GCTGGTTCCC 
1651 AGCTAATTAT GTAAAGCTTC TAAGCCCTGG GACGAGCAAA ATCACTCCAA 
1701 CAGAGCCACC TAAGTCAACA GCATTAGCGG CAGTGTGCCA GGTGATTGGG 
1751 ATGTACGACT ACACCGCGCA GAATGACGAT GAGCTGGCCT TCAACAAGGG
1801 CCAGATCATC AACGTCCTCA ACAAGGAGGA CCCTGACTGG TGGAAAGGAG 
1851 AAGTCAATGG ACAAGTGGGG CTCTTCCCAT CCAATTATGT GAAGCTGACC
1901 ACAGACATGG ACCCAAGCCA GCAATGAATC ATATGTTGTC CATCCCCCCC 
1951 TCAGGCTTGA AAGTCCTTTT GTGGCTTTCC TAGTTACTCA AATTGACTTT
2001 CCCCCACCTT TGCACAGGTG CTTTCAATAG TTTTAAAATT ATTTTTAAAT
2051 ATATATTTTA GCTTTTTAAT AAACAAAATA AATAAATGAC TTCTTTGCTA
2101 TTTTGGTTTT GCAAAAAGAC CCACTATCAA GGAATGCTGC ATGTGCTATT
2151 AAAAATTGTT CCAAATGTCC ATAAATCTGA GACTTGATGT ATTTTTTCAT 
2201 TTTGTCCAGT GTTACCAACT AAATTGTGCA GTTTGGGGCT TTTCCCCCTT 
2251 ACCATAGAAG TGCAGAGGAG TTCAGTATCT CTGTTTTAAA GACGTATAGA 
2301 ATGAGCCCAA TTAAAGCGAA GGTGTTTGTG CTTGTTTGTG TGTATCAGCT 
2351 GTACCTTGTT GAGCATGTAA TACATCCTGT ACATAAGAAA TTAGTTCTTT 
2401 CCATGGCAAA GCTATTACCT TGTACGATGC TCTAATCATA TTGCATTTAA 
2451 TTTTATTTTG CACAGTGACC TTGTAGCCAC ATGAGAAAGC ACTCTGTGTT 
2501 TTTGTTCGGT CTCAGATTTA TCTGGTTGAG TTGGTGTTTT GTTTGGGGTT
2551 TTTAATTTTG CGTGTTTGCA TAGCATAAAA TCAGTAGACA ACACCACTGA
2601 GGTCGTTACG ATCAACGATA TCCACAGTCT CTTTTTAGTC TCTGTTACAT
2651 GAAGTTTTAT TCCAGTTACT TTTCATGGAA TGACCTATTT TGAACAAGTA
2701 ATTTTCTTGA CAAGAAAGAA TGTATAGAAG TCTCCCTGCA ATTAATTTCC
2751 AATGTTTACA TTTTTTAACT AGACTGTGGA ATTTCTACAG ATTAATATGA
2801 AATGGAGCTC ATGGTCCGTT TGTGTGTTAG ATATGCTGTA GCTGAAGCCC
2851 TGTTTGTCTT TTAAACACTA GTTGGAAGCT CTCAATAAAA ATGCCTGCTG
2901 CTCACAGCAC AGAAAATGGG GCAGGGGGAG CCTCAAGCAC AATCTAGCTG
2951 TCCTCCTAAA GACTCTGTAA TGCTCACTCC CCTCGCGTTC TCCCGGCGCT
3001 GTCGGGAGGC TGTGCTGGTG GTCGTGTAAG GTCCTTCTCC TTTCACATGG
3051 TGCAGAGAGC GAGGACCTCT CCTCCTCGTT CAGTTGCACT TCAGTATTTT
3101 CACGGATATG AATGTAAAAT ATATAAATAT ATAAACCTGC GGCTTTAACA
3151 ACTGTAATAC AACCTTTTGA ATTAGTTCCG TGTATAGATA ATTAAATTCT
3201 TCATACAAAA GTTAAAAAAA AAAAAAAAAA A

#9 translated protein sequence:



1 TTQRQEIEST NKSRELRIAE ITHLQQQLQE SQQMLGRLIP EKQILNDQLK
51 QVQQNSLHRD SLVTLKRALE AKELARQHLR DQLDEVEKET RSKLQEIDIF 
101 NNQLKELREI HNKQQLQKQK SMEAERLKQK EQERKIIELE KQKEEAQRRA
151 QERDKQWLEH VQQEDEHQRP RKLHEEEKLK REESVKKKDG EEKGKQEAQD
201 KLGRLFHQHQ EPAKPAVQAP WSTAEKGPLT ISAQENVKVV YYRALYPFES 
251 RSHDEITIQP GDIVMVDESQ TGEPGWLGGE LKGKTGWFPA NYAEKIPENE
301 VPAPVKPVTD STSAPAPKLA LRETPAPLAV TSSEPSTTPN NWADFSSTWP 
351 TSTNEKPETD NWDAWAAQPS LTVPSAGQLR QRSAFTPATA TGSSPSPVLG
401 QGEKVEGLQA QALYPWRAKK DNHLNFNKND VITVLEQQDM WWFGEVQGQK
451 GWFPKSYVKL ISGPIRKSTS MDSGSSESPA SLKRVASPAA KPVVSGEEIA
501 QVIASYTATG PEQLTLAPGQ LILIRKKNPG GWWEGELQAR GKKRQIGWFP 
551 ANYVKLLSPG TSKITPTEPP KSTALAAVCQ VIGMYDYTAQ NDDELAFNKG
601 QIINVLNKED PDWWKGEVNG QVGLFPSNYV KLTTDMDPSQ Q* 

Whole protein sequence 



1 TTQRQEIEST NKSRELRIAE ITHLQQQLQE SQQMLGRLIP EKQILNDQLK 
51 QVQQNSLHRD SLVTLKRALE AKELARQHLR DQLDEVEKET RSKLQEIDIF
101 NNQLKELREI HNKQQLQKQK SMEAERLKQK EQERKIIELE KQKEEAQRRA 
151 QERDKQWLEH VQQEDEHQRP RKLHEEEKLK REESVKKKDG EEKGKQEAQD 
201 KLGRLFHQHQ EPAKPAVQAP WSTAEKGPLT ISAQENVKVV YYRALYPFES 
251 RSHDEITIQP GDIVMVDESQ TGEPGWLGGE LKGKTGWFPA NYAEKIPENE 
301 VPAPVKPVTD STSAPAPKLA LRETPAPLAV TSSEPSTTPN NWADFSSTWP 
351 TSTNEKPETD NWDAWAAQPS LTVPSAGQLR QRSAFTPATA TGSSPSPVLG
401 QGEKVEGLQA QALYPWRAKK DNHLNFNKND VITVLEQQDM WWFGEVQGQK
451 GWFPKSYVKL ISGPIRKSTS MDSGSSESPA SLKRVASPAA KPVVSGEEIA
501 QVIASYTATG PEQLTLAPGQ LILIRKKNPG GWWEGELQAR GKKRQIGWFP 
551 ANYVKLLSPG TSKITPTEPP KSTALAAVCQ VIGMYDYTAQ NDDELAFNKG 
601 QIINVLNKED PDWWKGEVNG QVGLFPSNYV KLTTDMDPSQ Q*IICCPSPP 
651 QA*KSFCGFP SYSN*LSPTF AQVLSIVLKL FLNIYFSFLI NKINK*LLCY
701 FGFAKRPTIK ECCMCY*KLF QMSINLRLDV FFHFVQCYQL NCAVWGFSPL 
751 P*KCRGVQYL CFKDV*NEPN *SEGVCACLC VSAVPC*ACN TSCT*EISSF 
801 HGKAITLYDA LIILHLILFC TVTL*PHEKA LCVFVRSQIY LVELVFCLGF 
851 LILRVCIA*N Q*TTPLRSLR STISTVSF*S LLHEVLFQLL FME*PILNK*
901 FS*QERMYRS LPAINFQCLH FLTRLWNFYR LI*NGAHGPF VC*ICCS*SP 
951 VCLLNTSWKL SIKMPAAHST ENGAGGASST I*LSS*RLCN AHSPRVLPAL 
1001 SGGCAGGRVR SFSFHMVQRA RTSPPRSVAL QYFHGYECKI YKYINLRL*Q 
1051 L*YNLLN*FR V*IIKFFIQK LKKKKK